NSF PAR Search | NSF Public Access Repository

Applications of 3D Zernike Descriptors in Protein Structure Comparison

https://doi.org/10.2142/biophys.65.201

KAGAYA, Yuki; KIHARA, Daisuke (January 2025, Seibutsu Butsuri)

Full Text Available
AlphaFold model quality self‐assessment improvement via deep graph learning

https://doi.org/10.1002/pro.70274

Verburgt, Jacob; Zhang, Zicong; Kihara, Daisuke (September 2025, Protein Science)

Abstract In recent years, significant advancements have been made in deep learning‐based computational modeling of proteins, with DeepMind's AlphaFold2 standing out as a landmark achievement. These computationally modeled protein structures not only provide atomic coordinates but also include self‐confidence metrics to assess the relative quality of the modeling, either for individual residues or the entire protein. However, these self‐confidence scores are not always reliable; for instance, poorly modeled regions of a protein may sometimes be assigned high confidence. To address this limitation, we introduce Equivariant Quality Assessment Folding (EQAFold), an enhanced framework that refines the Local Distance Difference Test prediction head of AlphaFold to generate more accurate self‐confidence scores. Our results demonstrate that EQAFold outperforms the standard AlphaFold architecture and recent model quality assessment protocols in providing more reliable confidence metrics. Source code for EQAFold is available at https://github.com/kiharalab/EQAFold_public.
Full Text Available
GO2Sum: generating human-readable functional summary of proteins from GO terms

https://doi.org/10.1038/s41540-024-00358-0

Giri, Swagarika Jaharlal; Ibtehaz, Nabil; Kihara, Daisuke (December 2024, npj Systems Biology and Applications)

Abstract Understanding the biological functions of proteins is of fundamental importance in modern biology. To represent a function of proteins, Gene Ontology (GO), a controlled vocabulary, is frequently used, because it is easy to handle by computer programs avoiding open-ended text interpretation. Particularly, the majority of current protein function prediction methods rely on GO terms. However, the extensive list of GO terms that describe a protein function can pose challenges for biologists when it comes to interpretation. In response to this issue, we developed GO2Sum (Gene Ontology terms Summarizer), a model that takes a set of GO terms as input and generates a human-readable summary using the T5 large language model. GO2Sum was developed by fine-tuning T5 on GO term assignments and free-text function descriptions for UniProt entries, enabling it to recreate function descriptions by concatenating GO term descriptions. Our results demonstrated that GO2Sum significantly outperforms the original T5 model that was trained on the entire web corpus in generating Function, Subunit Structure, and Pathway paragraphs for UniProt entries.
Full Text Available
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling

https://doi.org/10.1609/aaai.v39i1.32019

Li, Haoran; Li, Xingjian; Shi, Jiahua; Chen, Huaming; Du, Bo; Kihara, Daisuke; Barthelemy, Johan; Shen, Jun; Xu, Min (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Cryo-Electron Tomography (cryo-ET) is a 3D imaging technology that facilitates the study of macromolecular structures at near-atomic resolution. Recent volumetric segmentation approaches on cryo-ET images have drawn widespread interest in the biological sector. However, existing methods heavily rely on manually labeled data, which requires highly professional skills, thereby hindering the adoption of fully-supervised approaches for cryo-ET images. Some unsupervised domain adaptation (UDA) approaches have been designed to enhance the segmentation network performance using unlabeled data. However, applying these methods directly to cryo-ET image segmentation tasks remains challenging due to two main issues: 1) the source dataset, usually obtained through simulation, contains a fixed level of noise, while the target dataset, directly collected from raw data from the real-world scenario, has unpredictable noise levels. 2) the source data used for training typically consists of known macromolecules. In contrast, the target domain data are often unknown, causing the model to be biased towards those known macromolecules, leading to a domain shift problem. To address such challenges, in this work, we introduce a voxel-wise unsupervised domain adaptation approach, termed Vox-UDA, specifically for cryo-ET subtomogram segmentation. Vox-UDA incorporates a noise generation module to simulate target-like noises in the source dataset for cross-noise level adaptation. Additionally, we propose a denoised pseudo-labeling strategy based on the improved Bilateral Filter to alleviate the domain shift problem. More importantly, we construct the first UDA cryo-ET subtomogram segmentation benchmark on three experimental datasets. Extensive experimental results on multiple benchmarks and newly curated real-world datasets demonstrate the superiority of our proposed approach compared to state-of-the-art UDA methods.
Full Text Available
Proteomic Analysis of Unicellular Cyanobacterium Crocosphaera subtropica ATCC 51142 under Extended Light or Dark Growth

https://doi.org/10.1021/acs.jproteome.4c00439

Panda, Punyatoya; Giri, Swagarika J; Sherman, Louis A; Kihara, Daisuke; Aryal, Uma K (February 2025, Journal of Proteome Research)

Full Text Available
Twenty years of advances in prediction of nucleic acid-binding residues in protein sequences

https://doi.org/10.1093/bib/bbaf016

Basu, Sushmita; Yu, Jing; Kihara, Daisuke; Kurgan, Lukasz (November 2024, Briefings in Bioinformatics)

Abstract Computational prediction of nucleic acid-binding residues in protein sequences is an active field of research, with over 80 methods that were released in the past 2 decades. We identify and discuss 87 sequence-based predictors that include dozens of recently published methods that are surveyed for the first time. We overview historical progress and examine multiple practical issues that include availability and impact of predictors, key features of their predictive models, and important aspects related to their training and assessment. We observe that the past decade has brought increased use of deep neural networks and protein language models, which contributed to substantial gains in the predictive performance. We also highlight advancements in vital and challenging issues that include cross-predictions between deoxyribonucleic acid (DNA)-binding and ribonucleic acid (RNA)-binding residues and targeting the two distinct sources of binding annotations, structure-based versus intrinsic disorder-based. The methods trained on the structure-annotated interactions tend to perform poorly on the disorder-annotated binding and vice versa, with only a few methods that target and perform well across both annotation types. The cross-predictions are a significant problem, with some predictors of DNA-binding or RNA-binding residues indiscriminately predicting interactions with both nucleic acid types. Moreover, we show that methods with web servers are cited substantially more than tools without implementation or with no longer working implementations, motivating the development and long-term maintenance of the web servers. We close by discussing future research directions that aim to drive further progress in this area.
Full Text Available
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling

Li, Haoran; Li, Xingjian; Shi, Jiahua; Chen, Huaming; Du, Bo; Kihara, Daisuke; Barthelemy, Johan; Shen, Jun; Xu, Min (February 2025, AAAI Conference on Artificial Intelligence)

Full Text Available
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling

Li, Haoran; Li, Xingjian; Shi, Jiahua; Chen, Huaming; Du, Bo; Kihara, Daisuke; Barthelemy, Johan; Shen, Jun; Xu, Min (February 2025, The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25))

Full Text Available
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling

https://doi.org/10.48448/3zgk-4y71

Barthelemy, Johan; Chen, Huaming; Du, Bo; Kihara, Daisuke; Li, Haoran; Li, Xingjian; Shen, Jun; Shi, Jiahua; Xu, Min (January 2025, Underline Science Inc.)

Unveiling the stochastic nature of human heteropolymer ferritin self‐assembly mechanism

https://doi.org/10.1002/pro.5104

Bou‐Abdallah, Fadi; Fish, Jeremie; Terashi, Genki; Zhang, Yuanyuan; Kihara, Daisuke; Arosio, Paolo (August 2024, Protein Science)

Abstract Despite ferritin's critical role in regulating cellular and systemic iron levels, our understanding of the structure and assembly mechanism of isoferritins, discovered over eight decades ago, remains limited. Unveiling how the composition and molecular architecture of hetero‐oligomeric ferritins confer distinct functionality to isoferritins is essential to understanding how the structural intricacies of H and L subunits influence their interactions with cellular machinery. In this study, ferritin heteropolymers with specific H to L subunit ratios were synthesized using a uniquely engineered plasmid design, followed by high‐resolution cryo‐electron microscopy analysis and deep learning‐based amino acid modeling. Our structural examination revealed unique architectural features during the self‐assembly mechanism of heteropolymer ferritins and demonstrated a significant preference for H‐L heterodimer formation over H‐H or L‐L homodimers. Unexpectedly, while dimers seem essential building blocks in the protein self‐assembly process, the overall mechanism of ferritin self‐assembly is observed to proceed randomly through diverse pathways. The physiological significance of these findings is discussed including how ferritin microheterogeneity could represent a tissue‐specific adaptation process that imparts distinctive tissue‐specific functions to isoferritins.
Full Text Available
